Model-based Clustering With Soft And Probabilistic Constraints

Authors

  • Martin H. C. Law
  • Alexander Topchy
  • Anil K. Jain
Abstract

The problem of clustering with constraints has received considerable attention recently. Many existing algorithms assume that the specified constraints are correct and consistent. We take a new approach and model each constraint as a random variable. This enables us to model the uncertainty of constraints in a principled manner. The effect of constraints can be readily propagated to the neighborhood by biasing the search for the optimal parameters in each cluster. This enforces “smooth” cluster labels. The posterior probabilities of these constraint random variables represent the a posteriori enforcement of the corresponding constraints. By combining these probability values with the data likelihood, we arrive at an objective function for parameter estimation. Using the variational method, we derive an EM algorithm that maximizes the lower bound of the objective function for efficient parameter estimation. Experimental results demonstrate the usefulness of the proposed algorithm. In particular, our approach can identify the desired clusters when only a small portion of the data participates in constraints.
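To make the general idea of soft constraints in model-based clustering concrete, here is a minimal sketch. It is not the algorithm derived in the paper: the abstract does not give the objective function or the update equations, so the sketch swaps in a simpler, commonly used stand-in, a spherical Gaussian mixture whose E-step is a mean-field (variational) update under pairwise label couplings. The function name `soft_constrained_gmm`, the `(i, j, w)` constraint encoding (positive `w` favors the same cluster, negative `w` favors different clusters, and the magnitude expresses confidence), and all parameters are assumptions made for illustration.

```python
import numpy as np

def soft_constrained_gmm(X, k, constraints, n_iter=50, n_mean_field=5, seed=0):
    """Illustrative EM for a spherical Gaussian mixture with soft pairwise constraints.

    constraints: list of (i, j, w); w > 0 softly links points i and j,
    w < 0 softly separates them. This encoding is an assumption of the sketch.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape

    # Symmetric coupling matrix built from the soft constraints.
    W = np.zeros((n, n))
    for i, j, w in constraints:
        W[i, j] += w
        W[j, i] += w

    means = X[rng.choice(n, size=k, replace=False)].copy()
    var = max(float(X.var()), 1e-6)   # one shared spherical variance
    weights = np.full(k, 1.0 / k)     # mixing proportions
    q = np.full((n, k), 1.0 / k)      # approximate posterior over labels

    for _ in range(n_iter):
        # Per-point, per-cluster log joint of the unconstrained mixture.
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_lik = (np.log(weights)
                   - 0.5 * sq / var
                   - 0.5 * d * np.log(2.0 * np.pi * var))

        # Variational E-step: a few parallel mean-field sweeps in which each
        # point's label distribution is biased by its constrained partners.
        for _ in range(n_mean_field):
            logits = log_lik + W @ q
            logits -= logits.max(axis=1, keepdims=True)  # numerical stability
            q = np.exp(logits)
            q /= q.sum(axis=1, keepdims=True)

        # M-step: weighted updates of the mixture parameters.
        nk = q.sum(axis=0) + 1e-12
        weights = nk / n
        means = (q.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = max(float((q * sq).sum() / (n * d)), 1e-6)

    return q, means
```

Calling `soft_constrained_gmm(X, 2, [(0, 1, 2.0), (0, 4, -2.0)])` returns an `(n, k)` matrix of soft assignments and the fitted cluster means. Because the constraints only shift the logits, a constraint that strongly conflicts with the data can be overridden, which is the qualitative behavior the abstract describes for uncertain constraints. The parallel mean-field sweeps are a simplification and may oscillate when the couplings are very strong.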





Publication date: 2004